Integration of cache-based model and topic dependent class model with soft clustering and soft voting

Authors

  • Welly Naptali
  • Masatoshi Tsuchiya
  • Seiichi Nakagawa
Abstract

A topic dependent class (TDC) [1] language model (LM) is a topic-based LM that uses a semantic extraction method to reveal latent topic information from noun relations. A clustering of a given context is then performed to define topics. Finally, a fixed window of word history is observed to decide the topic of the current event through voting in an online manner. Previously, we have shown that TDC outperforms several state-of-the-art baselines. This paper introduces two separate contributions. First, we improve TDC further by incorporating a cache-based LM through unigram scaling. The combination is possible because TDC only tries to capture topical words and does not model recurring words, such as function words, very well. Experiments on the Wall Street Journal (WSJ) and Japanese newspaper (Mainichi Shimbun) corpora show that this combination improves the model significantly in terms of perplexity. Second, the stand-alone TDC model suffers from a shrinking training corpus as the number of topics increases. We solve this problem by performing soft clustering and soft voting in the training and test phases. Experimental results on the WSJ corpus show that TDC outperforms the baseline without needing to be interpolated with a word-based n-gram.
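
The abstract does not give formulas, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. Unigram scaling is shown in its commonly used form, where a base LM probability is multiplied by the ratio of a cache unigram to a background unigram raised to a power and then renormalized. Soft voting is shown as averaging per-word topic distributions over the history window instead of picking a single winning topic. All function names, the `beta` and `smoothing` parameters, and the data layout are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence


def cache_unigram(history: Sequence[str], vocab: Sequence[str],
                  smoothing: float = 1.0) -> Dict[str, float]:
    """Additively smoothed unigram distribution over recently seen words (the cache)."""
    vocab_set = set(vocab)
    counts = Counter(w for w in history if w in vocab_set)
    total = sum(counts.values()) + smoothing * len(vocab_set)
    return {w: (counts[w] + smoothing) / total for w in vocab_set}


def unigram_scale(base_probs: Dict[str, float],
                  background_unigram: Dict[str, float],
                  cache_probs: Dict[str, float],
                  beta: float = 0.5) -> Dict[str, float]:
    """Rescale base LM probabilities by (cache / background) ** beta, then renormalize.

    Assumes every word in base_probs has a non-zero background probability.
    """
    scaled = {
        w: p * (cache_probs[w] / background_unigram[w]) ** beta
        for w, p in base_probs.items()
    }
    norm = sum(scaled.values())
    return {w: p / norm for w, p in scaled.items()}


def soft_vote(history: Sequence[str],
              word_topic_probs: Dict[str, List[float]],
              num_topics: int) -> List[float]:
    """Average the topic distributions of known history words (hypothetical soft vote).

    Falls back to a uniform distribution when no history word is known.
    """
    totals = [0.0] * num_topics
    voters = 0
    for word in history:
        probs = word_topic_probs.get(word)
        if probs is None:
            continue
        for k in range(num_topics):
            totals[k] += probs[k]
        voters += 1
    if voters == 0:
        return [1.0 / num_topics] * num_topics
    return [t / voters for t in totals]
```

In this sketch, a word that recurs in the recent history gets a cache probability above its background probability, so its adapted probability rises, while the soft vote yields a topic distribution that could be used to mix topic-specific models rather than selecting only one.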

Similar resources

Stability analysis and feedback control of T-S fuzzy hyperbolic delay model for a class of nonlinear systems with time-varying delay

In this paper, a new T-S fuzzy hyperbolic delay model for a class of nonlinear systems with time-varying delay is presented to address the problems of stability analysis and feedback control. A fuzzy controller is designed based on parallel distributed compensation (PDC), and with a new Lyapunov function, delay-dependent asymptotic stability conditions of the closed-loop system are derived v...

Modeling and Estimating the Dimensions of Stable Alluvial Channels using Soft Calculations

In this research, soft computational models including multiple adaptive spline regression model (MARS) and data group classification model (GMDH) were used to estimate the geometric dimensions of stable alluvial channels including channel surface width (w), flow depth (h), and longitudinal slope (S) and the results of the developed models were compared with the multilayer neural network (MLP) m...

Application of Soft Computing Methods for the Estimation of Roadheader Performance from Schmidt Hammer Rebound Values

Estimation of roadheader performance is one of the main topics in determining the economics of underground excavation projects. Poor performance estimation of roadheaders can lead to costly contractual claims. In this paper, the application of soft computing methods for data analysis called adaptive neuro-fuzzy inference system-subtractive clustering method (ANFIS-SCM) and artificial neu...

Developing a Local Model of Leadership Based on Soft Power in Iran Sport Federations

The aim of this study was to develop a local model of leadership based on soft power in Iran sport federations. This study had a qualitative approach and the method of grounded theory was used as the research methodology. The data were collected by library resources, field observation, audio media and in-depth and open interviews with 23 elite experts. The validity of this study was investigate...

A multi-criteria vehicle routing problem with soft time windows by simulated annealing

This paper presents a multi-criteria vehicle routing problem with soft time windows (VRPSTW) to minimize fleet cost, routes cost, and violation of soft time windows penalty. In this case, the fleet is heterogeneous. The VRPSTW consists of a number of constraints in which vehicles are allowed to serve customers out of the desirable time window by a penalty. It is assumed that this relaxation a...

Publication date: 2010